184 research outputs found

    A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

    The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find that the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance. Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Mining
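As a rough illustration of two of the methods named in this abstract, the sketch below averages classifier probabilities (simple aggregation) and then greedily picks classifiers, with replacement, by validation score (one common formulation of ensemble selection). The abstract does not give the paper's exact procedure or metric, so the accuracy metric, the function names, and the classifier names in any input are assumptions made for illustration.

```python
from statistics import mean


def accuracy(probs, labels, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def aggregate(members):
    """Simple aggregation: average the members' probabilities per example."""
    return [mean(column) for column in zip(*members)]


def ensemble_selection(predictions, labels, rounds=5):
    """Greedy forward selection with replacement on a validation set.

    predictions maps a (hypothetical) classifier name to its probabilities.
    Accuracy is an assumed metric; swap in whatever the task calls for.
    """
    chosen = []
    for _ in range(rounds):
        # Add whichever classifier gives the best averaged ensemble.
        best = max(
            sorted(predictions),
            key=lambda name: accuracy(
                aggregate([predictions[n] for n in chosen + [name]]), labels
            ),
        )
        chosen.append(best)
    return chosen
```

Because a classifier can be chosen more than once, the selected list acts as a set of integer weights on the averaged prediction, which is one way to see the trade-off between diversity and performance that the abstract mentions.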

    Structural Drift: The Population Dynamics of Sequential Learning

    We introduce a theory of sequential causal inference in which learners in a chain estimate a structural model from their upstream teacher and then pass samples from the model to their downstream student. It extends the population dynamics of genetic drift, recasting Kimura's selectively neutral theory as a special case of a generalized drift process using structured populations with memory. We examine the diffusion and fixation properties of several drift processes and propose applications to learning, inference, and evolution. We also demonstrate how the organization of drift process space controls fidelity, facilitates innovations, and leads to information loss in sequential learning with and without memory. Comment: 15 pages, 9 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/sdrift.ht
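For orientation, the classical neutral drift process that this abstract generalizes can be sketched as a Wright-Fisher simulation: each generation re-estimates an allele frequency from a finite sample of the previous one, much like a learner re-estimating a one-parameter model from its teacher's samples. This is only the textbook memoryless case, not the paper's structured-population theory, and the function name and parameters are illustrative assumptions.

```python
import random


def generations_to_fixation(pop_size, initial_count, rng):
    """Run neutral Wright-Fisher drift until one allele takes over.

    Returns (generations, final_count), where final_count is 0 or pop_size.
    """
    count = initial_count
    generations = 0
    while 0 < count < pop_size:
        freq = count / pop_size
        # Each individual in the next generation copies a random parent.
        count = sum(rng.random() < freq for _ in range(pop_size))
        generations += 1
    return generations, count
```

Repeating the run many times from a 50% starting frequency should fix the tracked allele in roughly half of the runs, since under neutrality the fixation probability equals the initial frequency.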

    Genetic Algorithm Amplifier Biasing System (GAABS): Genetic Algorithm for Biasing on Differential Analog Amplifiers

    Genetic Algorithm Amplifier Biasing System (GAABS): Senior Project Analysis.
    Summary of Functional Requirements: This project integrates LTSpice with a Python script that runs a genetic algorithm to bias a differential amplifier. The system biases the amplifier with two voltages: the base voltage for the PNP BJTs of the active loads, and a voltage controlling the current of the current sink. The Python script gets data from LTSpice's command-line call and runs iteratively until the system is biased to achieve the greatest gain on an arbitrary input voltage.
    Primary Constraints: The main challenges associated with this project are getting the genetic algorithm to work consistently and getting LTSpice to integrate well with the command line. The genetic algorithm, though controlled, involves a good deal of randomness in converging to a certain gain value. A strong genetic algorithm should converge to the same value every time and should be designed accordingly. I had never used LTSpice via the command line, but it shouldn't be too difficult to call. Collecting data from the simulation will be challenging, but ideally there would be resources for help on that portion.
    Economic: The original estimated cost for components is $0, as all the software should be open source and free to download, and access to a computer should be considered free. There is no hardware, as it's all simulation, so there is nothing to be purchased. Bill of Materials (item, cost): Bootcamp Application, $0; Python 2.7, $0; LTSpice, $0; total, $0. The total did end up being $0, as anticipated. Everything that could be downloaded was free to download. At the start of the project, development was anticipated to take 100+ hours. Given the need to integrate everything and to get the genetic algorithm working well, 100 hours seemed reasonable. In the end, it took roughly 80 hours. Trying different approaches to the problem took up a lot of time, and tweaking the genetic algorithm (and running the tests) took a long time, but the integration was easy to set up, which shaved a large chunk off the projected time to complete the project.
    Manufacturing Information: This code is open source on GitHub and won't be manufactured on a commercial basis.
    Environmental: There are no environmental impacts associated with manufacturing. The only potential impact of this project on the environment would be the heat generated by a computer running the script. The script can take 30 minutes or more to run and is somewhat intensive in terms of computing power; this generates heat from the computer running it, and heat from computers cannot be neglected in terms of its effect on global warming. However, the heat generated by one computer should be considered negligible, as there are much greater contributors.
    Manufacturability: As stated before, there is no issue with manufacturing this project because it's open source. Everything needed to run the code can be found online for free download, and the script can be taken from online.
    Sustainability: The code runs on Python 2.7 and the current version of LTSpice. It should have no issue running on later versions of Python and LTSpice, so long as there are no drastic changes. The project is on the internet, so it will remain available as long as it's not taken down by GitHub. Upgrades that would improve the design include running more children per generation in simulation at once to speed up runtime, and taking more generations to arrive at the best bias voltages to make the result more accurate.
    Ethical: There is no ethical implication to the use or design of this project.
    Health and Safety: Other than the impact of long-term computer use on a user, there are no health and safety concerns with this project whatsoever.
    Social and Political: There are no social and political implications to the use or design of this project.
    Development: During the development of this project, I had to learn how to use Python on a much deeper level. My CPE 101 class was in Python, but that was winter quarter of 2015, and this project took place in the winter and spring of 2018. I remembered very little, but I got to see a lot of the functionality of Python as a great language for running scripts that work on a variety of applications across platforms. I had to research a lot on genetic algorithms and how to implement them, as that was a huge portion of this project
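The search loop described in this abstract, a genetic algorithm adjusting two bias voltages to maximize gain, might look roughly like the sketch below. The project's actual code is not shown in this listing, so everything here is an assumption: `stand_in_gain` is a hypothetical smooth function standing in for the LTSpice simulation call, and the population size, voltage bounds, selection rule, and mutation schedule are arbitrary illustrative choices.

```python
import random


def stand_in_gain(v_base, v_sink):
    """Hypothetical smooth gain surface peaking at (1.2 V, 0.7 V).

    A placeholder for the LTSpice run, which is not reproduced here.
    """
    return 100.0 - 40.0 * (v_base - 1.2) ** 2 - 60.0 * (v_sink - 0.7) ** 2


def evolve(fitness, rng, pop_size=30, generations=60, lo=0.0, hi=3.0):
    """Keep the fittest quarter each generation; refill by mutated crossover."""
    population = [
        (rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(pop_size)
    ]
    sigma = 0.3
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(*ind), reverse=True)
        parents = population[: pop_size // 4]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover, then Gaussian mutation clipped to the bounds.
            child = tuple(
                min(hi, max(lo, rng.choice(pair) + rng.gauss(0.0, sigma)))
                for pair in zip(a, b)
            )
            children.append(child)
        population = parents + children
        sigma *= 0.95  # shrink mutations so the search settles
    return max(population, key=lambda ind: fitness(*ind))
```

Carrying the parents over unchanged (elitism) and shrinking the mutation size over time are two common ways to get the run-to-run consistency that the abstract lists as a design goal; raising `pop_size` or `generations` corresponds to the upgrades it proposes.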

    Verbal mediation of visual memory on the Continuous Visual Memory Test

    Observability and Controllability of Nonlinear Networks: The Role of Symmetry

    Observability and controllability are essential concepts in the design of predictive observer models and feedback controllers of networked systems. For example, noncontrollable mathematical models of real systems have subspaces that influence model behavior but cannot be controlled by an input. Such subspaces can be difficult to determine in complex nonlinear networks. Since almost all of the present theory was developed for linear networks without symmetries, here we present a numerical and group representational framework to quantify the observability and controllability of nonlinear networks with explicit symmetries, which shows the connection between symmetries and nonlinear measures of observability and controllability. We numerically observe and theoretically predict that not all symmetries have the same effect on network observation and control. Our analysis shows that the presence of symmetry in a network may decrease observability and controllability, although networks containing only rotational symmetries remain controllable and observable. These results alter our view of the nature of observability and controllability in complex networks, change our understanding of structural controllability, and affect the design of mathematical models to observe and control such networks. Comment: 19 pages, 9 figures

    Model Aggregation for Distributed Content Anomaly Detection

    Cloud computing offers a scalable, low-cost, and resilient platform for critical applications. Securing these applications against attacks targeting unknown vulnerabilities is an unsolved challenge. Network anomaly detection addresses such zero-day attacks by modeling attributes of attack-free application traffic and raising alerts when new traffic deviates from this model. Content anomaly detection (CAD) is a variant of this approach that models the payloads of such traffic instead of higher level attributes. Zero-day attacks then appear as outliers to properly trained CAD sensors. In the past, CAD was unsuited to cloud environments due to the relative overhead of content inspection and the dynamic routing of content paths to geographically diverse sites. We challenge this notion and introduce new methods for efficiently aggregating content models to enable scalable CAD in dynamically-pathed environments such as the cloud. These methods eliminate the need to exchange raw content, drastically reduce network and CPU overhead, and offer varying levels of content privacy. We perform a comparative analysis of our methods using Random Forest, Logistic Regression, and Bloom Filter-based classifiers for operation in the cloud or other distributed settings such as wireless sensor networks. We find that content model aggregation offers statistically significant improvements over non-aggregate models with minimal overhead, and that distributed and non-distributed CAD have statistically indistinguishable performance. Thus, these methods enable the practical deployment of accurate CAD sensors in a distributed attack detection infrastructure
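One way to picture aggregating content models without exchanging raw content is with Bloom filters of payload n-grams: two filters built with the same parameters can be merged with a bitwise OR. The abstract names Bloom Filter-based classifiers but does not describe the authors' aggregation method, so this is a sketch under assumptions; the class, the 3-byte n-gram choice, and the scoring function are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter stored as a Python int bitmask."""

    def __init__(self, size_bits=4096, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        """Union of two filters with identical parameters: bitwise OR."""
        mine = (self.size_bits, self.num_hashes)
        if mine != (other.size_bits, other.num_hashes):
            raise ValueError("filters must share size and hash count")
        merged = BloomFilter(self.size_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


def ngrams(payload, n=3):
    """Byte n-grams of a payload, the items stored in the filter."""
    return [payload[i : i + n] for i in range(len(payload) - n + 1)]


def unseen_fraction(model, payload, n=3):
    """Anomaly score: share of the payload's n-grams absent from the model."""
    grams = ngrams(payload, n)
    if not grams:
        return 0.0
    return sum(g not in model for g in grams) / len(grams)
```

In this sketch each site would train a filter on its own attack-free traffic and ship only the bit array, so the merged model recognizes n-grams seen at either site while the raw payloads never leave their origin.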

    Improving Critical Speed Calculations Using Flexible Bearing Support FRF Compliance Data.

    Lecture, pp. 69-78. The importance of including flexible supports in rotordynamic analyses is discussed. Various methods of including the support in rotordynamic calculations are reviewed. A method is described in which actual compliance frequency response function (FRF) data are used directly in a rotordynamic forced response computer program to accurately predict a steam turbine rotor's critical speed. The flexible support model is described as two single-degree-of-freedom (SDOF) spring-mass-damper systems per bearing support. The methodology of acquiring the FRF data via impact hammer testing is described, and the equations that incorporate the FRF data into the flexible support model are summarized. Three flexible support models of increasing sophistication are used to analytically predict the rotor and support resonances. These results are compared to the actual steam turbine speed-amplitude plots. Modelling the support as many speed-dependent SDOF systems accurately predicts the location of the rotor's first critical speed and also the split critical peaks and several support resonance speeds
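For readers unfamiliar with the SDOF support model mentioned in this abstract, the compliance FRF of a single spring-mass-damper system is the standard expression X/F = 1 / (k - m ω² + j c ω). The sketch below evaluates it; the paper's own equations are not reproduced in this listing, and the function names and any numeric values used with them are illustrative assumptions.

```python
import cmath
import math


def sdof_compliance(omega, mass, damping, stiffness):
    """Displacement per unit force, X/F, of a spring-mass-damper.

    omega is the excitation frequency in rad/s; SI units are assumed.
    """
    return 1.0 / complex(stiffness - mass * omega**2, damping * omega)


def natural_frequency(mass, stiffness):
    """Undamped natural frequency in rad/s."""
    return math.sqrt(stiffness / mass)


def magnitude_and_phase(frf_value):
    """Split a complex FRF value into magnitude and phase (radians)."""
    return abs(frf_value), cmath.phase(frf_value)
```

At zero frequency the compliance reduces to 1/k, and at the undamped natural frequency it is limited only by damping, with magnitude 1/(c ω) and a phase of -90 degrees. Measured impact-hammer FRF data would typically be compared against curves of this form when fitting SDOF parameters.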